Genomic sweeping for hypermethylated genes

Authors

  • Liang Goh
  • Susan K. Murphy
  • Sayan Mukherjee
  • Terrence S. Furey
Abstract

MOTIVATION: Genes silenced by the aberrant methylation of nearby CpG islands can contribute to the onset or progression of cancer and represent potential biomarkers for diagnosis and prognosis. Of more than 14,000 candidate genes with promoter-region CpG islands, relatively few have so far been validated as hypermethylated in cancer. A descriptive set of genes known to be unmethylated in cancer does not exist. The lack of a negative set and the large number of candidates necessitated the development of a new approach to identify novel genes hypermethylated in cancer.

RESULTS: We developed a general method, cluster_boost, that in an imbalanced data setting predicts new minority class members given limited known samples and a large set of unlabeled samples. Synthetic datasets modeled after the hypermethylated genes data show that cluster_boost can successfully identify minority samples within unlabeled data. Using genome sequence features, cluster_boost predicted candidate hypermethylated genes among 14,000 genes of unknown status. In primary ovarian cancers, we determined the methylation status for 15 genes with different levels of support for being hypermethylated. Results indicate cluster_boost can accurately identify novel genes hypermethylated in cancer.

AVAILABILITY: Software and datasets are freely available at http://labs.genome.duke.edu/FureyLab/cluster_boost.php.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Similar resources

Epigenetics of breast cancer: a review article

Stable molecular changes that persist through cell division without any change in the DNA sequence are known as epigenetic changes. The molecular mechanisms involved in this process include histone modifications, DNA methylation, protein complexes and antisense RNA. Cancer genome changes happen through a combination of DNA hypermethylation, long-term epigenetic silencing with loss of heterozygosity and genom...


Hypermethylated SUPERMAN epigenetic alleles in Arabidopsis.

Mutations in the SUPERMAN gene affect flower development in Arabidopsis. Seven heritable but unstable sup epi-alleles (the clark kent alleles) are associated with nearly identical patterns of excess cytosine methylation within the SUP gene and a decreased level of SUP RNA. Revertants of these alleles are largely demethylated at the SUP locus and have restored levels of SUP RNA. A transgenic Ara...


Reduced Representation Bisulfite Sequencing Determination of Distinctive DNA Hypermethylated Genes in the Progression to Colon Cancer in African Americans

Background and Aims. Many studies have focused on the determination of methylated targets in colorectal cancer. However, few analyzed the progressive methylation in the sequence from normal to adenoma and ultimately to malignant tumors. This is of utmost importance especially in populations such as African Americans who generally display aggressive tumors at diagnosis and for whom markers of ea...


Methyl-CpG binding proteins identify novel sites of epigenetic inactivation in human cancer.

Methyl-CpG binding proteins (MBDs) mediate histone deacetylase-dependent transcriptional silencing at methylated CpG islands. Using chromatin immunoprecipitation (ChIP) we have found that gene-specific profiles of MBDs exist for hypermethylated promoters of breast cancer cells, whilst a common pattern of histone modifications is shared. This unique distribution of MBDs is also characterized in ch...


Whole Gene Capture Analysis of 15 CRC Susceptibility Genes in Suspected Lynch Syndrome Patients

BACKGROUND AND AIMS Lynch Syndrome (LS) is caused by pathogenic germline variants in one of the mismatch repair (MMR) genes. However, up to 60% of MMR-deficient colorectal cancer cases are categorized as suspected Lynch Syndrome (sLS) because no pathogenic MMR germline variant can be identified, which leads to difficulties in clinical management. We therefore analyzed the genomic regions of 15 ...



Journal:
  • Bioinformatics

Volume 23, Issue 3

Publication date: 2007